Abstract. This paper proposes a method to address the longstanding problem of lack of
monotonicity in estimation of conditional and structural quantile functions, also known as 
the quantile crossing problem. The method consists in sorting or monotone rearranging the 
original estimated non-monotone curve into a monotone rearranged curve. We show that the 
rearranged curve is closer to the true quantile curve in finite samples than the original curve, 
establish a functional delta method for rearrangement-related operators, and derive functional 
limit theory for the entire rearranged curve and its functionals. We also establish validity of 
the bootstrap for estimating the limit law of the entire rearranged curve and its functionals.
Our limit results are generic in that they apply to every estimator of a monotone
econometric function, provided that the estimator satisfies a functional central limit theorem 
and the function satisfies some smoothness conditions. Consequently, our results apply to 
estimation of other econometric functions with monotonicity restrictions, such as demand, 
production, distribution, and structural distribution functions. We illustrate the results with 
an application to estimation of structural quantile functions using data on Vietnam veteran 
status and earnings. 
1. Introduction

This paper addresses the longstanding problem of lack of monotonicity in the estimation 
of conditional and structural quantile functions, also known as the quantile crossing problem 
(He, 1997). The most common approach to estimating quantile curves is to fit a curve, often
linear, pointwise for each probability index.¹ Researchers use this approach for a number
of reasons, including parsimony of the resulting approximations and excellent computational 
properties. The resulting fits, however, may not respect a logical monotonicity requirement 
- that the quantile curve should be increasing as a function of the probability index. This 
paper introduces a natural monotonization of these empirical curves by sampling from the es- 
timated non-monotone model, and then taking the resulting conditional quantile curves which 
by construction are monotone in the probability index. This construction of the monotone 
curve may be seen as a bootstrap and as a sorting or monotone rearrangement of the original 
non-monotone function (see Hardy et al., 1952, and references given below). We show that the 
rearranged curve is closer to the true quantile curve in finite samples than the original curve 
is, and derive functional limit distribution theory for the rearranged curve to perform simul- 
taneous inference on the entire quantile function. Our theory applies to both dependent and 
independent data, and to a wide variety of original estimators, with only the requirement that 
they satisfy a functional central limit theorem. Our results also apply to many other economet- 
ric problems with monotonicity restrictions, such as demand and production functions, option 
pricing functions, yield curves, distribution functions, and structural quantile functions (see 
Matzkin, 1994, for more examples and additional references). As an example, we provide an 
empirical application to estimation of structural distribution and quantile functions based on 
Abadie (2002) and Chernozhukov and Hansen (2005, 2006). 

There exist other methods to obtain monotonic fits for conditional quantile functions. He 
(1997), for example, proposed to impose a location-scale regression model, which naturally 
satisfies monotonicity. This approach is fruitful for location-scale situations, but in numerous 
cases the data do not satisfy the location-scale paradigm, as discussed in Lehmann (1974), 
Doksum (1974), and Koenker (2005). Koenker and Ng (2005) developed a computational 
method for quantile regression that imposes the non-crossing constraints in simultaneous fit- 
ting of a finite number of quantile curves. The statistical properties of this method have yet to 
be studied, and the method does not immediately apply to other quantile estimation methods. 
Mammen (1991) proposed two-step estimators, with mean estimation in the first step followed 



1 This includes all principal approaches to estimation of conditional quantile functions, such as the canonical 
quantile regression of Koenker and Bassett (1978) and censored quantile regression of Powell (1986). This also 
includes principal approaches to estimation of structural quantile functions, such as the instrumental quantile 
regression methods via control functions of Imbens and Newey (2001), Blundell and Powell (2003), Chesher 
(2003), and Koenker and Ma (2006), and instrumental quantile regression estimators of Chernozhukov and 
Hansen (2005, 2006). 
by isotonization in the second.² Similarly to Mammen (1991), we can employ quantile estimation
in the first step followed by isotonization in the second, obtaining an interesting method
whose properties have yet to be studied. In contrast, our method uses rearrangement rather 
than isotonization, and is much better suited for quantile applications. The reason is that iso- 
tonization is best suited for applications with (near) flat target functions, while rearrangement 
is best suited for applications with steep target functions, as in typical quantile applications. 
Indeed, in a numerical example closely matching our empirical application, presented in Sec- 
tion 3, rearrangement significantly outperforms isotonization. Finally, in an independent and 
contemporaneous work, Dette and Volgushev (2008) propose to obtain monotonic quantile 
curves by applying an integral transform to a local polynomial estimate of the conditional 
distribution function, and derive pointwise limit theory for this estimator. In contrast, we 
directly monotonize any generic estimate of a conditional quantile function and then derive 
generic functional limit theory for the entire monotonized curve.³

In addition to resolving the longstanding problem of estimating quantile curves that avoid 
crossing, this paper develops a number of original theoretical results on rearrangement esti- 
mators. It therefore makes both practical and theoretical contributions to econometrics and 
statistics. In order to discuss these contributions more specifically, it is helpful first to review 
some of the relevant literature and available results. We begin by noting that the idea of 
rearrangement goes back at least to Chebyshev (see Bronstein et al., 2003, p. 31, Hardy et 
al., 1952, and Lorentz, 1953, among others). Rearrangements have been extensively used in 
functional analysis and operations research (Villani, 2003, and Carlier and Dana, 2005), but 
not in econometrics or statistics until recently. Recent research on rearrangements in statistics
includes the work of Fougeres (1997), which used rearrangement to produce a monotonic kernel
density estimator and derived its uniform rates of convergence; Davydov and Zitikis (2005), 
which considered tests of monotonicity based on rearranged kernel mean regression; Dette et al. 
(2006) and Dette and Scheder (2006), which introduced smoothed rearrangements for kernel 
mean regressions and derived pointwise limit theory for these estimators; and Chernozhukov 
et al. (2006), which used univariate and multivariate rearrangements on point and interval 
estimators of monotone functions based on series and kernel regression estimators. In the con- 
text of our problem, rearrangement is also connected to the quantile regression bootstrap of 
Koenker (1994). In fact, our research grew from the realization that we could use this boot- 
strap for the purpose of monotonizing quantile regressions, and we discovered the link to the 
classical procedure of rearrangement later, while reading Villani (2003). 



2 Isotonization is also known as the "pool-adjacent-violators algorithm" in statistics and "ironing" in econom- 
ics. It amounts to projecting an initial estimate on the set of monotone functions. 

3 We refer to Dette and Volgushev (2008) for a nice, more detailed comparison of the two approaches. 
The theoretical contributions of this paper are threefold. First, our paper derives functional
limit theory for rearranged estimators and functional delta methods for rearrangement opera- 
tors, both of which are important original results. Second, the paper derives functional limit 
results for estimators obtained by rearrangement-related operations, which are also original 
results. For example, our theory includes as a special case the asymptotics of the conditional 
distribution function estimator based on quantile regression, whose properties have long re- 
mained unknown. Moreover, our limit theory applies to functions, encompassing the pointwise 
results. An attractive feature of our theoretical results is that they do not rely on independence 
of data, the particular estimation method used, or any parametric assumptions. They only 
require that a functional central limit theorem applies to the original estimator of the curve, 
and the population curves have some smoothness properties. Our results therefore apply to 
any quantile model and quantile estimator that satisfy these requirements. Third, our results 
immediately yield validity of the bootstrap for rearranged estimators, which is an important 
result for practice. 

We organize the rest of the paper as follows. In Section 2 we present some analytical results 
on rearrangement and then present all the main results; in Section 3 we provide an application 
and a numerical experiment that closely matches the application; and in Section 4 we give 
some concluding remarks. 

2. Rearrangement: Analytical and Empirical Properties 

In this section, we first describe rearrangement, then derive some basic analytical properties 
of the rearranged curves in the population, establish functional differentiability results, and 
finally establish functional limit theorems and other estimation properties. 

2.1. Rearrangement. We consider a target function $u \mapsto Q_0(u|x)$ that, for each $x \in \mathcal{X}$,
maps (0, 1) to the real line and is increasing in u. Suppose that $u \mapsto \hat Q(u|x)$ is a parametric or
nonparametric estimator of $Q_0(u|x)$. Throughout the paper, we use conditional and structural
quantile estimation as the main application, where $u \mapsto Q_0(u|x)$ is the quantile function of a
real response variable Y, given a vector of regressors X = x. Accordingly, we will usually refer
to the functions $u \mapsto Q_0(u|x)$ as quantile functions throughout the paper. In other applications,
such as estimation of conditional and structural distribution functions, other names would be
appropriate and we need to accommodate different domains, as described in Remark 1 below.

Typical estimation methods fit the quantile function $\hat Q(u|x)$ pointwise in $u \in (0,1)$.⁴ A
problem that might occur is that the map $u \mapsto \hat Q(u|x)$ may not be increasing in u, which
violates the logical monotonicity requirement. Another manifestation of this issue, known as

4 See Koenker and Bassett (1978), Powell (1986), Chaudhuri (1991), Buchinsky and Hahn (1998), Yu and 
Jones (1998), Abadie et al. (2002), Honore et al. (2002), and Chernozhukov and Hansen (2006), among 
others, for examples of exogenous, censored, endogenous, nonparametric, and other types of quantile regression 
estimators. 
the quantile crossing problem, is that the conditional quantile curves $x \mapsto \hat Q(u|x)$ may cross
for different values of u (He, 1997). Similar issues also arise in estimation of conditional and 
structural distribution functions (Hall et al., 1999, and Abadie, 2002). 

We can transform the possibly non-monotone function $u \mapsto \hat Q(u|x)$ into a monotone function
$u \mapsto \hat Q^*(u|x)$ by quantile bootstrap or rearrangement. That is, we consider the random variable
$Y_x := \hat Q(U|x)$, where $U \sim \mathrm{Uniform}(\mathcal{U})$ with $\mathcal{U} = (0, 1)$, and take its quantile function,
denoted by $u \mapsto \hat Q^*(u|x)$, instead of the original function $u \mapsto \hat Q(u|x)$. This variable has a
distribution function:

$$\hat F(y|x) := \int_0^1 1\{\hat Q(u|x) \le y\}\,du, \qquad (2.1)$$

which is naturally monotone in the level y, and quantile function:

$$\hat Q^*(u|x) := \hat F^{-1}(u|x) = \inf\{y : \hat F(y|x) \ge u\}, \qquad (2.2)$$

which is naturally monotone in the index u. Thus, starting with a possibly non-monotone
original curve $u \mapsto \hat Q(u|x)$, the rearrangement (2.1)-(2.2) produces a monotone quantile curve
$u \mapsto \hat Q^*(u|x)$. Of course, the rearranged quantile function $u \mapsto \hat Q^*(u|x)$ coincides with the
original function $u \mapsto \hat Q(u|x)$ if the original function is non-decreasing in u, but differs from it
otherwise.

The mechanism (2.1)-(2.2) and its name have a direct relation to the rearrangement operator
from functional analysis (Hardy et al., 1952), since $u \mapsto \hat Q^*(u|x)$ is the monotone rearrangement
of $u \mapsto \hat Q(u|x)$. Equivalently, as we stated earlier, rearrangement has a direct relation to the
quantile bootstrap (Koenker, 1994), since the rearranged quantile curve is the quantile function
of the bootstrap variable produced by the estimated quantile model. Moreover, we refer the
reader to Dette et al. (2006, p. 470) who, using a closely related motivation, introduced the
idea of smoothed rearrangement, which produces smoothed versions of (2.1) and (2.2) that
can be valuable in applications. Finally, for practical and computational purposes, it is helpful
to think of rearrangement as sorting. Indeed, to compute the rearrangement of a continuous
function $u \mapsto \hat Q(u|x)$ we simply set $\hat Q^*(u|x)$ as the u-th quantile of $\{\hat Q(u_1|x), \ldots, \hat Q(u_k|x)\}$,
where $\{u_1, \ldots, u_k\}$ is a sufficiently fine net of equidistant indices in (0, 1).

Remark 1. (Adjusting for domains other than the unit interval). Throughout the paper
we assume that the domain of all the functions is the unit interval, $\mathcal{U} = (0, 1)$, but in many
applications we may have to deal with different domains. For example, in quantile estimation
problems, we may consider a subinterval (a, b) of the unit interval as the domain, in order to
avoid estimation of tail quantiles. In distribution estimation problems, we may consider the
entire real line as the domain. In such cases we can first transform these functions to have
the unit interval as the domain. Concretely, suppose we have $\hat Q : (a, b) \to \mathbb{R}$. Then using any
increasing bijective mapping $\varphi : (a, b) \to (0, 1)$, we can define $\tilde Q := \hat Q \circ \varphi^{-1} : (0, 1) \to \mathbb{R}$, and
then proceed to obtain its rearrangement $\tilde Q^*$. In the case where $a > -\infty$ and $b < \infty$, we can
take $\varphi$ to be an affine mapping. In order to obtain the rearrangement of the original function
$\hat Q$, we then set $\hat Q^* = \tilde Q^* \circ \varphi$. □

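
The sorting description of rearrangement given before Remark 1 can be sketched in a few lines. This is a minimal illustration rather than the authors' implementation: the function name `rearrange`, the grid size, and the choice of the pseudo-quantile curve from Section 2.2 as a stand-in for an estimated curve are all assumptions made for the example.

```python
import numpy as np

def rearrange(q_values):
    """Monotone rearrangement of a curve evaluated on an equidistant net:
    sorting the fitted values gives the empirical quantile function of
    Q(U|x) when U is uniform on the net."""
    return np.sort(np.asarray(q_values, dtype=float))

# Equidistant net of k indices in (0, 1) (midpoints; k is a hypothetical choice).
k = 1000
u = (np.arange(k) + 0.5) / k

# Stand-in for a non-monotone estimated curve: the pseudo-quantile function
# Q(u) = 5{u + sin(2*pi*u)/pi} used as an example in Section 2.2.
q_hat = 5.0 * (u + np.sin(2.0 * np.pi * u) / np.pi)

q_star = rearrange(q_hat)  # rearranged curve on the same net
```

Sorting leaves an already increasing curve unchanged, which matches the statement that the rearranged and original curves coincide when the original is non-decreasing. For a subinterval (a, b) with an affine map as in Remark 1, an equidistant net on (a, b) maps to an equidistant net on (0, 1), so the same sort applies directly to the values on (a, b).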
Let $Q$ denote the pointwise probability limit of $\hat Q$, which we will refer to as the population
curve. In the analysis we distinguish the following two cases: 

(1) Monotonic Q: The population curve $u \mapsto Q(u|x)$ is increasing in u, and thus satisfies
the monotonicity requirement.

(2) Non-monotonic Q: The population curve $u \mapsto Q(u|x)$ is non-monotone due to
misspecification.

In case (1) the empirical curve $u \mapsto \hat Q(u|x)$ may be non-monotone due to estimation error,
while in case (2) it may be non-monotone due to both misspecification and estimation error.
A leading example of case (1) is when the population curve Q is correctly specified, so that
it equals the target quantile curve, namely $Q(u|x) = Q_0(u|x)$ for all $u \in (0,1)$. Case (1) also
allows for some degree of misspecification, provided that the population curve, $Q \neq Q_0$, remains
monotone. A leading example of case (2) is when the population curve Q is misspecified,
$Q \neq Q_0$, to a degree that makes $u \mapsto Q(u|x)$ non-monotone. For example, the common linear
specification $u \mapsto Q(u|x) = p(x)^T \beta(u)$ can be non-monotone if the support of X is sufficiently
rich, while the set of transformations of x, p(x), is not (Koenker, 2005, Chap 2.5). Typically,
by using a rich enough set p(x) we can approximate the true function $Q_0(u|x)$ sufficiently
well, and thus often avoid case (2). This is the strategy that we generally recommend, since
inference and limit theory under case (1) are theoretically and practically simpler than under
case (2). However, in what follows we analyze the behavior of rearranged estimates both in
cases (1) and (2), since either of these cases could occur in practice.
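
To make the remark about the linear specification concrete, consider a scalar-regressor sketch with $p(x) = (1, x)^T$ and differentiable coefficient functions. The names $\beta_0$, $\beta_1$, the threshold $x_c(u)$, and the differentiability are assumptions made only for this illustration.

```latex
\[
Q(u|x) = \beta_0(u) + \beta_1(u)\,x, \qquad
\partial_u Q(u|x) = \beta_0'(u) + \beta_1'(u)\,x .
\]
% If \beta_1'(u) \neq 0 at some u, define x_c(u) := -\beta_0'(u)/\beta_1'(u). Then
\[
\partial_u Q(u|x) < 0 \iff
\begin{cases}
x < x_c(u), & \beta_1'(u) > 0,\\
x > x_c(u), & \beta_1'(u) < 0.
\end{cases}
\]
```

So, unless the slope coefficient is constant in u (a pure location shift), the linear curve is monotone in u only if the support of X stays on the appropriate side of $x_c(u)$ for every u, which fails once the support is rich enough.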

In the rest of the section, we establish the empirical properties of the rearranged estimated 
quantile functions and the corresponding distribution functions: 

$$u \mapsto \hat Q^*(u|x) \quad \text{and} \quad y \mapsto \hat F(y|x), \qquad (2.3)$$

under cases (1) and (2). 

2.2. Basic Analytical Properties of Population Curves. We start by characterizing certain
analytical properties of the probability limits or population versions of empirical curves
(2.3), namely

$$y \mapsto F(y|x) = \int_0^1 1\{Q(u|x) \le y\}\,du,$$

$$u \mapsto Q^*(u|x) := F^{-1}(u|x) = \inf\{y : F(y|x) \ge u\}. \qquad (2.4)$$

We need these properties to derive our main limit results stated in the following sections. 

Recall first the following definitions from Milnor (1965). Let $g : \mathcal{U} \subset \mathbb{R} \to \mathbb{R}$ be a continuously
differentiable function. A point $u \in \mathcal{U}$ is called a regular point of g if the derivative
of g at this point does not vanish, i.e., $\partial_u g(u) \neq 0$, where $\partial_u$ denotes the partial derivative
operator with respect to u. A point u which is not a regular point is called a critical point.
A value $y \in g(\mathcal{U})$ is called a regular value of g if $g^{-1}(y)$ contains only regular points, i.e., if
$\forall u \in g^{-1}(y)$, $\partial_u g(u) \neq 0$. A value y which is not a regular value is called a critical value.

Define region $\mathcal{Y}_x$ as the support of $Y_x$, and regions $\mathcal{YX} := \{(y,x) : y \in \mathcal{Y}_x, x \in \mathcal{X}\}$ and
$\mathcal{UX} := \mathcal{U} \times \mathcal{X}$. We assume throughout that $\mathcal{Y}_x \subset \mathcal{Y}$, a compact subset of $\mathbb{R}$, and that $x \in \mathcal{X}$,
a compact subset of $\mathbb{R}^d$. In some applications the curves of interest are not functions of x, or
we might be interested in a particular value x. In this case, we can take the set $\mathcal{X}$ to be a
singleton $\mathcal{X} = \{x\}$.

Assumption 1. (Properties of Q). We maintain the following assumptions on Q throughout 
the paper: 

(a) $Q : \mathcal{U} \times \mathcal{X} \to \mathbb{R}$ is a continuously differentiable function in both arguments.

(b) The number of elements of $\{u \in \mathcal{U} \mid \partial_u Q(u|x) = 0\}$ is uniformly bounded on $x \in \mathcal{X}$.

Assumption 1(b) implies that, for each $x \in \mathcal{X}$, $\partial_u Q(u|x)$ is not zero almost everywhere on
$\mathcal{U}$ and can switch sign only a bounded number of times. Further, we define $\mathcal{Y}_x^*$ to be the subset
of regular values of $u \mapsto Q(u|x)$ in $\mathcal{Y}_x$, and $\mathcal{YX}^* := \{(y,x) : y \in \mathcal{Y}_x^*, x \in \mathcal{X}\}$.

We use the following simple example to describe some basic analytical properties of (2.4),
which we state more formally in the proposition given below. Consider the following
pseudo-quantile function: $Q(u) = 5\{u + \sin(2\pi u)/\pi\}$, which is highly non-monotone in (0,1) and
therefore fails to be a proper quantile function. The left panel of Figure 1 shows $Q$ together
with its monotone rearrangement $Q^*$. We see that $Q^*$ partially coincides with $Q$ on the areas
where $Q$ behaves like a proper quantile function, and that $Q^*$ is continuous and increasing.
Note also that 1/3 and 2/3 are the critical points of $Q$, and 3.04 and 1.96 are the corresponding
critical values. The right panel of Figure 1 shows the pseudo-distribution function $Q^{-1}$, which
is multi-valued, and the distribution function $F = Q^{*-1}$ induced by sampling from $Q$. We see
that $F$ is continuous and does not have point masses. The left panel of Figure 2 shows $\partial_u Q^*$,
the sparsity function for $Q^*$. We see that the sparsity function is continuous at the $Q^{*-1}$-image
of the regular values of $Q$ and has jumps at the $Q^{*-1}$-image of the critical values of $Q$. The
right panel of Figure 2 shows $\partial_y F$, the density function for $F$. We see that $\partial_y F$ is continuous
at the regular values of $Q$ and has jumps at the critical values of $Q$.
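
The critical points and critical values quoted in this example can be checked by hand. The following worked computation uses only the definition $Q(u) = 5\{u + \sin(2\pi u)/\pi\}$ from the example.

```latex
\begin{align*}
\partial_u Q(u) &= 5\{1 + 2\cos(2\pi u)\} = 0
  \iff \cos(2\pi u) = -\tfrac{1}{2}
  \iff u \in \{1/3,\ 2/3\} \quad \text{on } (0,1), \\
Q(1/3) &= 5\{1/3 + \sin(2\pi/3)/\pi\} = 5\{1/3 + \sqrt{3}/(2\pi)\} \approx 3.04, \\
Q(2/3) &= 5\{2/3 + \sin(4\pi/3)/\pi\} = 5\{2/3 - \sqrt{3}/(2\pi)\} \approx 1.96 .
\end{align*}
```

Hence $Q$ increases on $(0, 1/3)$, decreases on $(1/3, 2/3)$, and increases again on $(2/3, 1)$, so every level $y$ strictly between 1.96 and 3.04 has three preimages under $Q$, while levels outside that band have one.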

The following proposition states more formally the properties of Q* and F: 

Proposition 1 (Basic properties of F and Q*). The functions $y \mapsto F(y|x)$ and $u \mapsto Q^*(u|x)$
satisfy the following properties, for each $x \in \mathcal{X}$: (1) The set of critical values, $\mathcal{Y}_x \setminus \mathcal{Y}_x^*$, is finite,
and $\int_{\mathcal{Y}_x \setminus \mathcal{Y}_x^*} dF(y|x) = 0$. (2) For any $y \in \mathcal{Y}_x^*$,

$$F(y|x) = \sum_{k=1}^{K(y|x)} \mathrm{sign}\{\partial_u Q(u_k(y|x)|x)\}\, u_k(y|x) + 1\{\partial_u Q(u_{K(y|x)}(y|x)|x) < 0\},$$

where $\{u_k(y|x),\ k = 1, 2, \ldots, K(y|x) < \infty\}$ are the roots of $Q(u|x) = y$ in increasing order.
(3) For any $y \in \mathcal{Y}_x^*$, the ordinary derivative $f(y|x) := \partial_y F(y|x)$ exists and takes the form

$$f(y|x) = \sum_{k=1}^{K(y|x)} \frac{1}{|\partial_u Q(u_k(y|x)|x)|},$$

which is continuous at each $y \in \mathcal{Y}_x^*$. For any $y \in \mathcal{Y}_x \setminus \mathcal{Y}_x^*$, we set $f(y|x) := 0$. $F(y|x)$ is
absolutely continuous and strictly increasing in $y \in \mathcal{Y}_x$. Moreover, $y \mapsto f(y|x)$ is a
Radon-Nikodym derivative of $y \mapsto F(y|x)$ with respect to the Lebesgue measure. (4) The quantile
function $u \mapsto Q^*(u|x)$ partially coincides with $u \mapsto Q(u|x)$; namely $Q^*(u|x) = Q(u|x)$, provided
that $u \mapsto Q(u|x)$ is increasing at u, and the preimage of $Q^*(u|x)$ under $Q$ is unique. (5) The
quantile function $u \mapsto Q^*(u|x)$ is equivariant to monotone transformations of $u \mapsto Q(u|x)$,
in particular, to location and scale transformations. (6) The quantile function $u \mapsto Q^*(u|x)$
has an ordinary continuous derivative $\partial_u Q^*(u|x) = 1/f(Q^*(u|x)|x)$, when $Q^*(u|x) \in \mathcal{Y}_x^*$. This
function is also a Radon-Nikodym derivative with respect to the Lebesgue measure. (7) The
map $(y,x) \mapsto F(y|x)$ is continuous on $\mathcal{YX}$ and the map $(u,x) \mapsto Q^*(u|x)$ is continuous on
$\mathcal{UX}$.

Figure 1. Left: The pseudo-quantile function $Q$ and the rearranged quantile
function $Q^*$. Right: The pseudo-distribution function $Q^{-1}$ and the distribution
function $F$ induced by $Q$.

Figure 2. Left: The density (sparsity) function of the rearranged quantile
function $Q^*$. Right: The density function of the distribution function $F$ induced
by $Q$.
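
Parts (2) and (3) of Proposition 1 can be checked numerically on the pseudo-quantile example $Q(u) = 5\{u + \sin(2\pi u)/\pi\}$. The sketch below is illustrative only: the helper names (`roots_of_level`, `F_roots`, `f_roots`, `F_grid`), the grid sizes, and the bisection root finder are assumptions of this example, not part of the paper. It compares the root-based formula for $F$ with a direct evaluation of the integral in (2.4).

```python
import numpy as np

def Q(u):
    return 5.0 * (u + np.sin(2.0 * np.pi * u) / np.pi)

def dQ(u):
    return 5.0 * (1.0 + 2.0 * np.cos(2.0 * np.pi * u))

def roots_of_level(y, n_grid=2001, n_bisect=60):
    """Roots of Q(u) = y on [0, 1) in increasing order: bracket sign changes
    on a grid, then refine each bracket by bisection."""
    grid = np.linspace(0.0, 1.0, n_grid)
    vals = Q(grid) - y
    roots = []
    for a, b, fa, fb in zip(grid[:-1], grid[1:], vals[:-1], vals[1:]):
        if fa == 0.0:
            roots.append(a)
        elif fa * fb < 0.0:
            lo, hi = a, b
            for _ in range(n_bisect):
                mid = 0.5 * (lo + hi)
                if (Q(lo) - y) * (Q(mid) - y) <= 0.0:
                    hi = mid
                else:
                    lo = mid
            roots.append(0.5 * (lo + hi))
    return np.array(roots)

def F_roots(y):
    """F(y) from the signed sum over roots in Proposition 1(2)."""
    uk = roots_of_level(y)
    total = np.sum(np.sign(dQ(uk)) * uk)
    if uk.size and dQ(uk[-1]) < 0.0:
        total += 1.0
    return total

def f_roots(y):
    """Density f(y) from the sum of reciprocal slopes in Proposition 1(3)."""
    uk = roots_of_level(y)
    return np.sum(1.0 / np.abs(dQ(uk)))

def F_grid(y, n=200_000):
    """Direct Riemann approximation of the integral of 1{Q(u) <= y} over (0, 1)."""
    u = (np.arange(n) + 0.5) / n
    return np.mean(Q(u) <= y)

for y in (1.0, 2.2, 2.8, 4.0):
    print(y, F_roots(y), F_grid(y), f_roots(y))
```

Levels between the two critical values produce three roots with alternating slope signs, and levels outside that band produce one root, so the loop exercises both branches of the formula.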

2.3. Functional Delta Method for Rearrangement-Related Operators. Here we derive 
a functional delta method for the rearrangement operator $Q \mapsto Q^*$ and the pre-rearrangement
operator $Q \mapsto F$ defined by equation (2.4). These results constitute the first set of original
main theoretical results obtained in this paper. In the subsequent sections, these results allow 
us to establish a generic functional central limit theorem for the estimated functions Q* and 
F, as well as to establish validity of the bootstrap for estimating their limit laws. 

In order to describe the results, let $\ell^\infty(\mathcal{UX})$ denote the set of bounded and measurable
functions $h : \mathcal{UX} \to \mathbb{R}$, $C(\mathcal{UX})$ the set of continuous functions $h : \mathcal{UX} \to \mathbb{R}$, and $\ell^1(\mathcal{UX})$ the
set of measurable functions $h : \mathcal{UX} \to \mathbb{R}$ such that $\int_{\mathcal{U}} \int_{\mathcal{X}} |h(u|x)|\,du\,dx < \infty$, where du and dx
denote the integration with respect to the Lebesgue measure on $\mathcal{U}$ and $\mathcal{X}$, respectively.

Proposition 2 (Hadamard derivatives of F and Q* with respect to Q). (1) Define
$F(y|x, h_t) := \int_0^1 1\{Q(u|x) + t h_t(u|x) \le y\}\,du$. As $t \to 0$,

$$D_{h_t}(y|x,t) := \frac{F(y|x,h_t) - F(y|x)}{t} \to D_h(y|x), \qquad (2.5)$$

$$D_h(y|x) := -\sum_{k=1}^{K(y|x)} \frac{h(u_k(y|x)|x)}{|\partial_u Q(u_k(y|x)|x)|}. \qquad (2.6)$$

The convergence holds uniformly in any compact subset of $\mathcal{YX}^* := \{(y,x) : y \in \mathcal{Y}_x^*, x \in \mathcal{X}\}$,
for every $\|h_t - h\|_\infty \to 0$, where $h_t \in \ell^\infty(\mathcal{UX})$ and $h \in C(\mathcal{UX})$. (2) Define
$Q^*(u|x,h_t) := F^{-1}(u|x,h_t) = \inf\{y : F(y|x,h_t) \ge u\}$. As $t \to 0$,

$$\tilde D_{h_t}(u|x,t) := \frac{Q^*(u|x,h_t) - Q^*(u|x)}{t} \to \tilde D_h(u|x), \qquad (2.7)$$

$$\tilde D_h(u|x) := -\frac{1}{f(Q^*(u|x)|x)}\, D_h(Q^*(u|x)|x). \qquad (2.8)$$

The convergence holds uniformly in any compact subset of $\mathcal{UX}^* = \{(u,x) : (Q^*(u|x),x) \in \mathcal{YX}^*\}$,
for every $\|h_t - h\|_\infty \to 0$, where $h_t \in \ell^\infty(\mathcal{UX})$ and $h \in C(\mathcal{UX})$.

This proposition establishes the Hadamard (compact) differentiability of the rearrangement
operator $Q \mapsto Q^*$ and the pre-rearrangement operator $Q \mapsto F$ with respect to $Q$, tangentially
to the subspace of continuous functions. Note that the convergence holds uniformly on regions
that exclude the critical values of the mapping $u \mapsto Q(u|x)$. These results are new and could
be of independent interest. Rearrangement operators include inverse (quantile) operators as
a special case. In this sense, our results generalize the previous results of Gill and Johansen
(1990), Doss and Gill (1992), and Dudley and Norvaisa (1999) on the functional delta method
(Hadamard differentiability) for the quantile operator. There are two main difficulties in
establishing the Hadamard differentiability in our case: first, as in the quantile case, we allow
the perturbations $h_t$ to $Q$ to be discontinuous functions, though converging to continuous
functions; second, unlike in the quantile case, we allow the perturbed functions $Q + t h_t$ to
be non-monotone even when $Q$ is monotone. We need to allow for such rich perturbations
in order to match empirical applications, where empirical perturbations $h_t = (\hat Q - Q)/t$ are
discontinuous functions, though converging to continuous functions by means of a functional
central limit theorem; moreover, the empirical (pseudo) quantile functions $\hat Q = Q + t h_t$ are
not monotone even when $Q$ is monotone.

The following result deals with the monotonic case. It is worth emphasizing separately, 
because functional derivatives are particularly simple and we do not have to exclude any non- 
regular regions from the domains. 

Corollary 1 (Hadamard derivatives of F and Q* with respect to Q in the monotonic case).
Suppose $u \mapsto Q(u|x)$ has $\partial_u Q(u|x) > 0$, for each $(u,x) \in \mathcal{UX}$. Then $\mathcal{YX}^* = \mathcal{YX}$ and
$\mathcal{UX}^* = \mathcal{UX}$. Therefore, the convergence in Proposition 2 holds uniformly over the entire $\mathcal{YX}$ and $\mathcal{UX}$,
respectively. Moreover, $\tilde D_h(u|x) = h(u|x)$, i.e., the Hadamard derivative of the rearranged function
with respect to the original function is the identity operator.
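
The identity-derivative claim in Corollary 1 can be traced in two lines once the derivative formulas are specialized to a monotone $Q$. The sketch below is a heuristic check rather than a proof. It assumes that the derivative of $F$ takes the form $D_h(y|x) = -\sum_k h(u_k(y|x)|x)/|\partial_u Q(u_k(y|x)|x)|$, which mirrors the density formula in Proposition 1(3), and it writes $\tilde D_h$ for the derivative of $Q^*$ given in (2.8).

```latex
% Monotone case: \partial_u Q > 0, so K(y|x) = 1, u_1(y|x) = F(y|x), and Q^* = Q.
\begin{align*}
D_h(y|x) &= -\frac{h(F(y|x)\,|\,x)}{\partial_u Q(F(y|x)\,|\,x)}, \qquad
f(y|x) = \frac{1}{\partial_u Q(F(y|x)\,|\,x)}, \\
\tilde D_h(u|x) &= -\frac{D_h(Q(u|x)\,|\,x)}{f(Q(u|x)\,|\,x)}
  = \frac{h(u|x)}{\partial_u Q(u|x)} \cdot \partial_u Q(u|x)
  = h(u|x),
\end{align*}
% where the second line uses F(Q(u|x)|x) = u.
```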

Next we consider the following linear functionals obtained by integration:

$$(y', x) \mapsto \int_{\mathcal{Y}} g(y|x,y')\, F(y|x)\,dy, \qquad (u', x) \mapsto \int_{\mathcal{U}} g(u|x,u')\, Q^*(u|x)\,du,$$

with the restrictions on g specified below. These functionals are of interest because they
are useful building blocks for various statistics, for example, Lorenz curves with function
$g(u|x,u') = 1\{u \le u'\}$, as discussed in the next section. The following proposition calculates
the Hadamard derivative of these functionals.

Proposition 3 (Hadamard derivative of linear functionals of Q* and F with respect to Q).
The following results are true with the limits being continuous on the specified domains:

1. $\int_{\mathcal{Y}} g(y|x,y')\, D_{h_t}(y|x,t)\,dy \to \int_{\mathcal{Y}} g(y|x,y')\, D_h(y|x)\,dy$

uniformly in $(y',x) \in \mathcal{YX}$, for any measurable g that is bounded uniformly in its arguments
and such that $(x,y') \mapsto g(y|x,y')$ is continuous for a.e. y.

2. $\int_{\mathcal{U}} g(u|x,u')\, \tilde D_{h_t}(u|x,t)\,du \to \int_{\mathcal{U}} g(u|x,u')\, \tilde D_h(u|x)\,du$ (2.9)

uniformly in $(u',x) \in \mathcal{UX}$, for any measurable g such that $\sup_{u',x} |g(u|x,u')| \in \ell^1(\mathcal{U})$ and such
that $(x,u') \mapsto g(u|x,u')$ is continuous for a.e. u.

It is important to note that Proposition 3 applies to integrals defined over entire domains,
unlike Proposition 2, which states uniform convergence of integrands over domains excluding
non-regular neighborhoods. (Thus, Proposition 3 does not immediately follow from Proposition
2.) Here integration acts like a smoothing operation and allows us to ignore these non-regular
neighborhoods. In order to prove convergence of integrals defined over entire domains, we
couple the pointwise convergence implied by Proposition 2 with the uniform integrability of
Lemma 3 in the Appendix, and then interchange limits and integrals. We should also note
that an alternative way of proving result (2.9), but not other results in the paper, can be based
on the convexity of the functional in (2.9) with respect to the underlying curve, following the
approach of Mossino and Temam (1981), and Alvino et al. (1989). Due to this limitation, we
do not pursue this approach in this paper.

It is also worth emphasizing the properties of the following smoothed functionals. For a
measurable function $f : \mathbb{R} \to \mathbb{R}$ define the smoothing operator S as

$$Sf(y') := \int k_\delta(y' - y)\, f(y)\,dy, \qquad (2.10)$$

where $k_\delta(v) = 1\{|v| \le \delta\}/2\delta$ and $\delta > 0$ is a fixed bandwidth. Accordingly, the smoothed
curves $SF$ and $SQ^*$ are given by

$$SF(y'|x) := \int k_\delta(y' - y)\, F(y|x)\,dy, \qquad SQ^*(u'|x) := \int k_\delta(u' - u)\, Q^*(u|x)\,du.$$

Note that given the quantile function $Q^*$, the smoothed function $SQ^*$ has a convenient
interpretation of a local average quantile function or fractile. Since we form these curves as
differences of the elementary functionals in Proposition 3 divided by $2\delta$, the following corollary
is immediate:
Corollary 2 (Hadamard derivative of smoothed Q* and F with respect to Q). We have
that $S D_{h_t}(y'|x,t) \to S D_h(y'|x)$ uniformly in $(y',x) \in \mathcal{YX}$, and $S \tilde D_{h_t}(u'|x,t) \to S \tilde D_h(u'|x)$
uniformly in $(u',x) \in \mathcal{UX}$. The results hold uniformly in the smoothing parameter $\delta \in [\delta_1, \delta_2]$,
where $\delta_1$ and $\delta_2$ are positive constants.

Note that smoothing allows us to achieve uniform convergence over the entire domain, 
without excluding non-regular neighborhoods. 
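
The smoothing operator in (2.10) can be computed exactly as the text describes it, as a difference of two integrals divided by $2\delta$. The sketch below is a minimal illustration under stated assumptions: the function name `smooth_by_differences`, the grid, the bandwidth value, and the use of the trapezoid rule with linear interpolation are choices made for this example, and the curve is treated as zero outside the grid, so only points at least $\delta$ away from the boundary are genuine local averages.

```python
import numpy as np

def smooth_by_differences(f_vals, grid, delta):
    """Approximate Sf(y') = (1 / (2*delta)) * integral of f over [y' - delta, y' + delta]
    as [I(y' + delta) - I(y' - delta)] / (2*delta), where I is the running integral of f."""
    f_vals = np.asarray(f_vals, dtype=float)
    # Running integral I(t) from grid[0] to each grid point (trapezoid rule).
    increments = 0.5 * (f_vals[1:] + f_vals[:-1]) * np.diff(grid)
    integral = np.concatenate([[0.0], np.cumsum(increments)])
    # np.interp holds I constant outside the grid, i.e. f is treated as zero there.
    upper = np.interp(grid + delta, grid, integral)
    lower = np.interp(grid - delta, grid, integral)
    return (upper - lower) / (2.0 * delta)

grid = np.linspace(0.0, 1.0, 1001)
q = 5.0 * (grid + np.sin(2.0 * np.pi * grid) / np.pi)  # pseudo-quantile example of Section 2.2
q_star = np.sort(q)                                    # rearranged curve on the grid

delta = 0.05                                           # hypothetical bandwidth
sq_star = smooth_by_differences(q_star, grid, delta)   # smoothed rearranged curve
```

Writing the smoother as a difference of running integrals mirrors the remark that the smoothed curves are differences of the elementary functionals of Proposition 3 with $g(u|x,u') = 1\{u \le u'\}$.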

2.4. Empirical Properties and Functional Limit Theory for Rearranged Estimators. 

Here we state a finite sample result and then derive functional limit laws for rearranged esti- 
mators. These results constitute the second set of original main theoretical results obtained in 
this paper. 

The following proposition shows that the rearranged quantile curves have smaller estimation 
error than the original curves whenever the latter are not monotone. 

Proposition 4 (Improvement in estimation property provided by rearrangement). Suppose
that $\hat Q$ is an estimator (not necessarily consistent) for some true quantile curve $Q_0$. Then, the
rearranged curve $\hat Q^*$ is closer to the true curve than $\hat Q$ in the sense that, for each $x \in \mathcal{X}$,

$$\|\hat Q^* - Q_0\|_p \le \|\hat Q - Q_0\|_p, \qquad p \in [1, \infty],$$

where $\|\cdot\|_p$ denotes the $L^p$ norm of a measurable function $Q : \mathcal{U} \to \mathbb{R}$, namely
$\|Q\|_p = \{\int_{\mathcal{U}} |Q(u)|^p\,du\}^{1/p}$. The inequality is strict for $p \in (1, \infty)$ whenever $u \mapsto \hat Q(u|x)$ is strictly
decreasing on a subset of $\mathcal{U}$ of positive Lebesgue measure, while $u \mapsto Q_0(u|x)$ is strictly increasing
on $\mathcal{U}$. The above property is independent of the sample size and of the way the estimate of the
curve is obtained, and thus continues to hold in the population.

This property suggests that the rearranged estimators should be preferred over the original 
estimators. Moreover, this property does not depend on the way the quantile model is estimated 
or any other specifics, and is thus applicable quite generally. Regarding the proof of this 
property, the weak reduction in estimation error follows from an application of a classical 
rearrangement inequality of Lorentz (1953) and the strict reduction follows from its appropriate 
strengthening (Chernozhukov et al., 2006).⁵
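
The inequality in Proposition 4 is easy to observe numerically. The sketch below is illustrative only: the linear target curve `q_true`, the use of the pseudo-quantile function from Section 2.2 as the non-monotone "estimate", the grid size, and the helper `lp_distance` are assumptions made for this example. On an equidistant grid the rearrangement is a sort, and the grid version of the $L^p$ distance to an increasing target cannot increase after sorting.

```python
import numpy as np

k = 10_000
u = (np.arange(k) + 0.5) / k                         # equidistant net in (0, 1)

q_true = 5.0 * u                                     # hypothetical increasing target curve Q_0
q_hat = 5.0 * (u + np.sin(2.0 * np.pi * u) / np.pi)  # non-monotone stand-in for an estimate
q_star = np.sort(q_hat)                              # rearranged curve

def lp_distance(a, b, p):
    """Grid approximation of the L^p distance on (0, 1); p may be np.inf."""
    diff = np.abs(a - b)
    if np.isinf(p):
        return diff.max()
    return np.mean(diff ** p) ** (1.0 / p)

for p in (1, 2, 4, np.inf):
    before = lp_distance(q_hat, q_true, p)
    after = lp_distance(q_star, q_true, p)
    print(f"p={p}: original {before:.4f}, rearranged {after:.4f}")
```

Because the stand-in estimate is strictly decreasing on (1/3, 2/3) while the target is strictly increasing, the reduction is strict for finite $p > 1$ in this example.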

The following proposition derives functional limit laws for the rearranged quantile estimator
$\hat Q^*$ and the corresponding distribution estimator $\hat F$, using the functional delta method for
the rearrangement-related operators from the previous section. We maintain the following
assumptions on $\hat Q$ throughout the paper:

5 Similar contractivity properties have been shown for the pool adjacent violators algorithm in different
contexts. See, for example, Robertson et al. (1988) for isotonic regression, and Eggermont and LaRiccia (2000)
for monotone density estimation. Glad et al. (2003) show that a density estimator corrected to be a proper
density satisfies a similar property.
Assumption 2. (Properties of $\hat Q$). The empirical curve $\hat Q$ takes its values in the space of
bounded measurable functions defined on $\mathcal{UX}$, and, in $\ell^\infty(\mathcal{UX})$,

$$a_n(\hat Q(u|x) - Q(u|x)) \Rightarrow G(u|x), \qquad (2.11)$$

as a stochastic process indexed by $(u,x) \in \mathcal{UX}$, where $(u,x) \mapsto G(u|x)$ is a stochastic process
(typically Gaussian) with continuous paths. Here $a_n$ is a sequence of constants such that
$a_n \to \infty$ as $n \to \infty$, where n is the sample size.

Condition (2.11) requires that the original quantile estimator satisfies a functional central
limit theorem with a continuous limit stochastic process over the domain $\mathcal{U} = (0, 1)$ for the
index $u$. If (2.11) holds only over a subinterval of $(0, 1)$, we can accommodate the reduced
domain following Remark 1. This key condition is rather weak, and it holds for a wide variety
of conditional and structural quantile estimators. With an appropriate normalization rate
and a fixed $x$, this assumption also holds for series and local-polynomial quantile regressions.

Proposition 5 (Functional limit laws for $\hat{F}$ and $\hat{Q}^*$). In $\ell^\infty(K)$, where $K$ is any compact
subset of $\mathcal{YX}^*$,

$$a_n(\hat{F}(y|x) - F(y|x)) \Rightarrow D_G(y|x) \tag{2.12}$$

as a stochastic process indexed by $(y,x) \in \mathcal{YX}^*$; and in $\ell^\infty(\mathcal{UX}_K)$, with
$\mathcal{UX}_K = \{(u,x) : (Q^*(u|x),x) \in K\}$,

$$a_n(\hat{Q}^*(u|x) - Q^*(u|x)) \Rightarrow \tilde{D}_G(u|x), \tag{2.13}$$

as a stochastic process indexed by $(u,x) \in \mathcal{UX}_K$.

This proposition provides the basis for inference using rearranged quantile estimators and
corresponding distribution estimators. Let us first discuss inference for the case with a
monotonic population curve $Q$. Proposition 5 enables us to perform uniform inference on $Q$ and $F$
based on the rearranged estimators $\hat{Q}^*$ and $\hat{F}$. It is useful to emphasize the following corollary
of Proposition 5:

Corollary 3 (Functional limit laws for $\hat{F}$ and $\hat{Q}^*$ in the monotonic case). Suppose $u \mapsto Q(u|x)$
has $\partial_u Q(u|x) > 0$ for each $(u,x) \in \mathcal{UX}$. Then $\mathcal{YX}^* = \mathcal{YX}$ and $\mathcal{UX}^* = \mathcal{UX}$. Accordingly,
the convergence in Proposition 5 holds uniformly over the entire $\mathcal{YX}$ and $\mathcal{UX}$. Moreover,
$\tilde{D}_G(u|x) = G(u|x)$, i.e., the rearranged quantile curves have the same first order asymptotic
distribution as the original estimated quantile curves.

^For sufficient conditions, see, for example, Gutenbrunner and Jureckova (1992), Portnoy (1991), Angrist et
al. (2006), and Chernozhukov and Hansen (2006).

^See, for example, Chaudhuri (1991) and He and Shao (2000); Belloni and Chernozhukov (2007) have recently
extended the results of He and Shao (2000) to the process case and established the functional central limit
theorem for $a_n(\hat{Q}(u|x) - Q(u|x))$ for a fixed $x$.
Thus, if the population curve is monotone, we can rearrange the original non-monotone
quantile estimator to be monotonic without affecting its (first order) asymptotic properties.
Hence, all the inference tools that apply to the original quantile estimator $\hat{Q}$ also apply to
the rearranged quantile estimator $\hat{Q}^*$. In particular, if the bootstrap is valid for the original
estimate, it is also valid for the rearranged estimate, by the functional delta method for the
bootstrap.

Remark 2. (Detecting and avoiding cases with non-monotone $Q$.) Before discussing inference
for the case with a non-monotonic population curve $Q$, let us first emphasize that since
non-monotonicity of $Q$ is a rather obvious sign of specification error, it is best to try to detect
and avoid this case. For this purpose we should use sufficiently flexible functional forms and
reject the ones that fail to pass monotonicity tests. For example, we can use the following
generic test of monotonicity for $Q$: If $Q$ is monotone, the first order behavior of $\hat{Q}^*$ and $\hat{Q}$
coincides, and if $Q$ is not monotone, $\hat{Q}^*$ and $\hat{Q}$ converge to different probability limits $Q^*$ and
$Q$. Therefore, we can reject the hypothesis of monotone $Q$ if a uniform confidence region for
$Q$ based on $\hat{Q}$ does not contain $\hat{Q}^*$, for at least one point $x \in \mathcal{X}$. □
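The generic test in Remark 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the fitted curves and the uniform confidence band are taken as given arrays on a common grid of quantile indices, and the function name `monotonicity_check` is hypothetical.

```python
import numpy as np

def monotonicity_check(q_hat, lower, upper):
    """Return True if the rearranged curves stay inside the uniform band [lower, upper].

    q_hat, lower, upper: arrays of shape (n_x, n_u); each row is a curve over the
    quantile-index grid for one value of x. The band is assumed to be a uniform
    confidence band for Q built from the original (unrearranged) estimator.
    """
    q_star = np.sort(q_hat, axis=1)                  # rearrange each curve in u
    inside = (q_star >= lower) & (q_star <= upper)
    return bool(inside.all())                        # False: reject monotonicity of Q

# Hypothetical example: one mildly non-monotone curve and two candidate bands.
q_hat = np.array([[0.0, 2.0, 1.0, 3.0]])
print(monotonicity_check(q_hat, q_hat - 1.5, q_hat + 1.5))   # wide band
print(monotonicity_check(q_hat, q_hat - 0.5, q_hat + 0.5))   # narrow band
```

With the wide band the rearranged curve stays inside the region and monotonicity is not rejected; with the narrow band it leaves the region at the two crossing points.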

Let us now discuss inference for the case with a non-monotonic population curve $Q$. In
this case, the large sample properties of the rearranged quantile estimators $\hat{Q}^*$ substantially
differ from those of the initial quantile estimators $\hat{Q}$. Proposition 5 still enables us to perform
uniform inference on the rearranged population curve $Q^*$ based on the rearranged estimator
$\hat{Q}^*$, but only after excluding certain nonregular neighborhoods (for the distribution estimates,
the neighborhoods of the critical values of the map $u \mapsto Q(u|x)$, and, for the rearranged
quantile estimates, the image of these neighborhoods under $F$). These neighborhoods can be
excluded by locating the points $(u, x)$ where a consistent estimate of $|\partial_u Q(u|x)|$ is close to zero;
see Hendricks and Koenker (1991) for a consistent estimator of $|\partial_u Q(u|x)|$.

Next we consider the following linear functionals of the rearranged quantile and distribution
estimates:

$$(y',x) \mapsto \int_{\mathcal{Y}} g(y|x,y')\hat{F}(y|x)\,dy, \qquad (u',x) \mapsto \int_{\mathcal{U}} g(u|x,u')\hat{Q}^*(u|x)\,du.$$

The following proposition derives functional limit laws for these functionals. Here the
convergence results hold without excluding any nonregular neighborhoods, which is convenient for
practice in the non-monotonic case.

^This test is conservative, but it is generic and very inexpensive. In order to build non-conservative tests,
we need to derive the limit laws for $\|\hat{Q} - \hat{Q}^*\|$ for suitable norms $\|\cdot\|$. These laws will depend on higher-order
functional limit laws for quantile estimators, which appear to be non-generic and have to be dealt with on a
case by case basis.

^Working with these functionals is equivalent to placing our empirical processes into the space $L^p$ ($p = 1$ for
rearranged distributions and $p = \infty$ for quantiles), equipped with weak* topology, instead of strong topology.
Convergence in law of the integral functionals, shown in Proposition 6, is equivalent to the convergence in law
of the rearranged estimated processes in such a metric space.
Proposition 6 (Functional limit laws for linear functionals of $\hat{Q}^*$ and $\hat{F}$). Under the same
restrictions on the function $g$ as in Proposition 3, the following results hold with the limits
being continuous on the specified domains:

1. $$a_n \int_{\mathcal{Y}} g(y|x,y')(\hat{F}(y|x) - F(y|x))\,dy \Rightarrow \int_{\mathcal{Y}} g(y|x,y') D_G(y|x)\,dy, \tag{2.14}$$

as a stochastic process indexed by $(y',x) \in \mathcal{YX}$, in $\ell^\infty(\mathcal{YX})$.

2. $$a_n \int_{\mathcal{U}} g(u|x,u')(\hat{Q}^*(u|x) - Q^*(u|x))\,du \Rightarrow \int_{\mathcal{U}} g(u|x,u') \tilde{D}_G(u|x)\,du, \tag{2.15}$$

as a stochastic process indexed by $(u',x) \in \mathcal{UX}$, in $\ell^\infty(\mathcal{UX})$.

The linear functionals defined above are useful building blocks for various statistics, such
as partial means, various moments, and Lorenz curves. For example, the conditional Lorenz
curve based on rearranged quantile functions is

$$\hat{L}(u'|x) := \left(\int_{\mathcal{U}} 1\{u \le u'\}\hat{Q}^*(u|x)\,du\right) \Big/ \left(\int_{\mathcal{U}} \hat{Q}^*(u|x)\,du\right), \tag{2.16}$$

which is a ratio of partial and overall conditional means. Hadamard differentiability of the
mapping

$$Q \mapsto L(u'|x) := \left(\int_{\mathcal{U}} 1\{u \le u'\}Q^*(u|x)\,du\right) \Big/ \left(\int_{\mathcal{U}} Q^*(u|x)\,du\right), \tag{2.17}$$

with respect to $Q$ immediately follows from (a) the differentiability of a ratio $\beta/\gamma$ with respect
to its numerator $\beta$ and denominator $\gamma$ at $\gamma \ne 0$, (b) Hadamard differentiability of the numerator
and denominator in (2.17) with respect to $Q$ established in Proposition 3, and (c) the chain
rule for the functional delta method. Hence, provided that $\int_{\mathcal{U}} Q^*(u|x)\,du \ne 0$, we have that in
the metric space $\ell^\infty(\mathcal{UX})$

$$a_n(\hat{L}(u'|x) - L(u'|x)) \Rightarrow L(u'|x) \cdot \left(\frac{\int_{\mathcal{U}} 1\{u \le u'\}\tilde{D}_G(u|x)\,du}{\int_{\mathcal{U}} 1\{u \le u'\}Q^*(u|x)\,du} - \frac{\int_{\mathcal{U}} \tilde{D}_G(u|x)\,du}{\int_{\mathcal{U}} Q^*(u|x)\,du}\right), \tag{2.18}$$

as an empirical process indexed by $(u',x) \in \mathcal{UX}$. In particular, validity of the bootstrap for
estimating this functional limit law in (2.18) holds by the functional delta method for the
bootstrap.
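To make the ratio in (2.16) concrete, here is a minimal Python sketch that evaluates a Lorenz curve from a fitted quantile curve sampled on a uniform grid of quantile indices. The discretization (equal weights per grid point) and the function name `lorenz_curve` are assumptions made for illustration.

```python
import numpy as np

def lorenz_curve(q_hat):
    """Lorenz curve on a uniform u-grid, computed from a fitted quantile curve.

    The curve is first rearranged (sorted); the value at each grid point u' is the
    partial sum of the rearranged curve up to u' divided by the overall sum, which
    is a discretized version of the ratio of partial and overall means.
    """
    q_star = np.sort(np.asarray(q_hat, dtype=float))   # rearranged quantile curve
    total = q_star.sum()
    if total == 0:
        raise ValueError("the overall mean must be nonzero")
    return np.cumsum(q_star) / total

# Hypothetical fitted values at four equally spaced quantile indices.
print(lorenz_curve([3.0, 1.0, 2.0, 4.0]))
```

For nonnegative outcomes, sorting before accumulating guarantees that the resulting curve is convex, which is not the case if the original non-monotone fit is used directly.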

We next consider the empirical properties of the smoothed curves obtained by applying the
linear smoothing operator $S$ defined in (2.10) to $\hat{F}$ and $\hat{Q}^*$:

$$S\hat{F}(y'|x) := \int k_\delta(y' - y)\hat{F}(y|x)\,dy, \qquad S\hat{Q}^*(u'|x) := \int k_\delta(u' - u)\hat{Q}^*(u|x)\,du.$$

The following corollary immediately follows from Corollary 2 and the functional delta method.

Corollary 4 (Functional limit laws for smoothed $\hat{Q}^*$ and $\hat{F}$). In $\ell^\infty(\mathcal{YX})$,

$$a_n(S\hat{F}(y'|x) - SF(y'|x)) \Rightarrow S D_G(y'|x), \tag{2.19}$$

as a stochastic process indexed by $(y',x) \in \mathcal{YX}$, and in $\ell^\infty(\mathcal{UX})$,

$$a_n(S\hat{Q}^*(u'|x) - SQ^*(u'|x)) \Rightarrow S\tilde{D}_G(u'|x), \tag{2.20}$$

as a stochastic process indexed by $(u',x) \in \mathcal{UX}$. The results hold uniformly in the smoothing
parameter $\delta \in [\delta_1, \delta_2]$, where $\delta_1$ and $\delta_2$ are positive constants.

Thus, as in the case of linear functionals, we can perform inference on $SQ^*$ based on the
smoothed rearranged estimates without excluding nonregular neighborhoods, which is
convenient for practice in the non-monotonic case. Furthermore, validity of the bootstrap for the
smoothed curves follows by the functional delta method for the bootstrap. Lastly, we note
that it is not possible to simultaneously allow $\delta \to 0$ and preserve the uniform convergence
stated in the corollary.

Our final corollary asserts validity of the bootstrap for inference on rearranged estimators 
and their functionals. This corollary follows from the functional delta method for the bootstrap 
(e.g., Theorem 13.9 in van der Vaart, 1998). 

Corollary 5 (Validity of the bootstrap for estimating laws of rearranged estimators). If
the bootstrap consistently estimates the functional limit law (2.11) of the empirical process
$\{a_n(\hat{Q}(u|x) - Q(u|x)), (u, x) \in \mathcal{UX}\}$, then it also consistently estimates the functional limit laws
(2.12), (2.13), (2.14), (2.15), (2.19), and (2.20).

3. Examples 

In this section we apply rearrangement to the estimation of structural quantile and distri- 
bution functions. We show how rearrangement monotonizes instrumental quantile and distri- 
bution function estimates, and demonstrate how to perform inference on the target functions 
using the results developed in this paper. Using a supporting numerical example, we show that 
rearranged estimators noticeably improve upon original estimators and also outperform iso- 
tonized estimators. Thus, rearrangement is necessarily preferable to the standard approach of 
simply ignoring non-monotonicity. Moreover, in quantile estimation problems, rearrangement 
is also preferable to the standard approach of isotonization used primarily in mean estimation 
problems. 

3.1. Empirical Example. We consider estimation of the causal/structural effects of Vietnam
veteran status $X \in \{0, 1\}$ on the quantiles and distribution of civilian earnings $Y$. Since
veteran status is likely to be endogenous relative to potential civilian earnings, we employ
an instrumental variables approach, using the U.S. draft lottery as an instrument for
Vietnam veteran status (Angrist, 1990). We use the same data subset from the Current Population
Survey as in Abadie (2002). We then estimate structural quantile and distribution functions
with the instrumental quantile regression estimator of Chernozhukov and Hansen (2005, 2006)
and the instrumental distribution regression estimator of Abadie (2002). Under some
assumptions these procedures consistently estimate the structural quantile and distribution
functions of interest. However, like most estimation methods mentioned in the introduction,
neither of these procedures explicitly imposes monotonicity of the distribution and quantile
functions. Accordingly, they can produce estimates in finite samples that are nonmonotonic
due to either sampling variation or violations of instrument independence or other modeling
assumptions. We monotonize these estimates using rearrangement and perform inference on
the target structural functions using uniform confidence bands constructed via bootstrap. We
use the programming language R to implement the procedures (R Development Core Team,
2007). We present our estimation and inference results in Figures 3-5.

In Figure 3 we show Abadie's estimates of the structural distribution of earnings for
veterans and non-veterans (left panel) as well as their rearrangements (right panel). For both
veterans and non-veterans, the original estimates of the distributions exhibit clear local
non-monotonicity. The rearrangement fixes this problem, producing increasing estimated
distribution functions. In Figure 4 we show Chernozhukov and Hansen's estimates of the structural
quantile functions of earnings for veterans (left panel) as well as their rearrangements (right
panel). For both veterans and non-veterans, the estimates of the quantile functions exhibit
pronounced local non-monotonicity. The rearrangement fixes this problem, producing increasing
estimated quantile functions. In the case of quantile functions, the non-monotonicity problem
is especially acute for the small sample of veterans.

In Figure [5j we plot uniform 90% confidence bands for the structural quantile functions of 
earnings for veterans and non-veterans, together with uniform 90% confidence bands for the 
effect of Vietnam veteran status on the quantile functions for earnings, which measures the dif- 
ference between the structural quantile functions for veterans and non-veterans. We construct 
the uniform confidence bands using both the original estimators and the rearranged estimators 
based on 500 bootstrap repetitions and a fine net of quantile indices {0.01,0.02, ...,0.99}. We 
obtain the bands for the rearranged functions assuming that the population structural quantile 
regression functions are monotonic, so that the first order behavior of the rearranged estimators 



These data consist of a sample of white men, born in 1950-1953, from the March Current Population 
Surveys of 1979 and 1981-1985. The data include annual labor earnings, the Vietnam veteran status and an 
indicator on the Vietnam era lottery. There are 11,637 men in the sample, with 2,461 Vietnam veterans and 
3,234 eligible for U.S. military service according to the draft lottery indicator. Abadie (2002) gives additional 
information on the data and the construction of the variables. 

^More specifically, Abadie's (2002) procedure consistently estimates these functions for the subpopulation
of compliers under instrument independence and monotonicity. Chernozhukov and Hansen's (2005, 2006)
approach consistently estimates these functions for the entire population under instrument independence and rank
similarity.
Figure 3. Abadie's estimates of the structural distributions of earnings for
veterans and non-veterans (left panel), and their rearrangements (right panel).

coincides with the behavior of the original estimators. The figure shows that even for the large 
sample of non-veterans the rearranged estimates lie within the original bands, thus passing our 
automatic test of monotonicity specified in Remark 2. Thus, the lack of monotonicity of the 
estimated quantile functions in this case is likely caused by sampling error. From the figure, 
we conclude that veteran status has a statistically significant negative effect in the lower tail, 
with the bands for the rearranged estimates showing a wider range of quantile indices for which 
this holds. 
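The band construction itself is not spelled out in the text, so the following Python sketch shows only one common way to build a uniform band from bootstrap draws: a sup-t band whose critical value is the bootstrap quantile of the maximal studentized deviation over the grid. The function name `uniform_band`, the studentization by the bootstrap standard deviation, and the simulated inputs are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def uniform_band(estimate, boot_draws, level=0.90, rearranged=True):
    """Sup-t uniform confidence band over a grid of quantile indices.

    estimate:   array of shape (n_u,), the fitted curve on the grid.
    boot_draws: array of shape (n_boot, n_u), bootstrap replications of the curve.
    If rearranged is True, the estimate and every bootstrap draw are sorted first.
    """
    est = np.asarray(estimate, dtype=float)
    draws = np.asarray(boot_draws, dtype=float)
    if rearranged:
        est = np.sort(est)
        draws = np.sort(draws, axis=1)
    se = draws.std(axis=0, ddof=1)                       # pointwise bootstrap std. dev.
    se = np.where(se > 0, se, np.finfo(float).eps)       # guard against division by zero
    t_max = np.max(np.abs(draws - est) / se, axis=1)     # sup of studentized deviations
    crit = np.quantile(t_max, level)                     # uniform critical value
    return est - crit * se, est + crit * se

# Hypothetical inputs: 500 bootstrap repetitions on the grid {0.01, ..., 0.99}.
rng = np.random.default_rng(1)
grid = np.linspace(0.01, 0.99, 99)
est = grid + rng.normal(scale=0.02, size=grid.size)
draws = est + rng.normal(scale=0.02, size=(500, grid.size))
band_lower, band_upper = uniform_band(est, draws, level=0.90)
```

Sorting each bootstrap draw before computing deviations corresponds to applying the rearrangement operator inside the bootstrap, which is what the functional delta method for the bootstrap justifies.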

3.2. Monte Carlo. We design a Monte Carlo experiment to closely match the previous
empirical example. In particular, we consider a location model, where the outcome is $Y = [1, X]\alpha + \epsilon$,
the endogenous regressor is $X = 1\{[1, Z]\pi + v > 0\}$, the instrument $Z$ is a binary random
variable, and the disturbances $(\epsilon, v)$ are jointly normal and independent of $Z$. The true structural
quantile functions are $Q_0(u|x) = [1, x]\alpha + Q_\epsilon(u)$, $x \in \{0, 1\}$, where $Q_\epsilon$ is the quantile function
of the normal variable $\epsilon$. The corresponding structural distribution functions are the inverse of
the quantile functions with respect to $u$. We select the value of the parameters by estimating
this location model parametrically by maximum likelihood, and then generate samples from
the estimated model, holding the values of the instruments $Z$ equal to those in the data set.
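The data-generating process can be sketched in Python as follows, using the parameter values reported in the accompanying footnote. The instrument values from the actual data set are not available here, so the vector `z` below is a simulated stand-in and is purely hypothetical; the function names are likewise illustrative.

```python
import numpy as np
from statistics import NormalDist

PI = np.array([-0.92, 0.40])            # first-stage coefficients (sd of v normalized to 1)
ALPHA = np.array([11753.0, -911.0])     # structural coefficients
SD_E, SD_V, COV_EV = 8100.0, 1.0, 379.0

def simulate(z, rng):
    """Draw (Y, X) from the location model, holding the instrument values z fixed."""
    cov = np.array([[SD_E ** 2, COV_EV], [COV_EV, SD_V ** 2]])
    e, v = rng.multivariate_normal(np.zeros(2), cov, size=z.size).T
    x = (PI[0] + PI[1] * z + v > 0).astype(int)      # endogenous binary regressor
    y = ALPHA[0] + ALPHA[1] * x + e                  # outcome
    return y, x

def true_quantile(u, x):
    """Q_0(u|x) = [1, x] alpha + Q_e(u), with Q_e the N(0, SD_E^2) quantile function."""
    return ALPHA[0] + ALPHA[1] * x + SD_E * NormalDist().inv_cdf(u)

# Hypothetical instrument vector; the paper holds Z at its values in the data set.
rng = np.random.default_rng(0)
z = rng.integers(0, 2, size=1000)
y, x = simulate(z, rng)
```

Because the model is a pure location shift, the two true quantile curves are parallel and strictly increasing, so any non-monotonicity in the fitted curves is due to sampling error.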

^More specifically, after normalizing the standard deviation of $v$ to one, we set $\pi = [-.92, .40]^T$,
$\alpha = [11{,}753, -911]^T$, the standard deviation of $\epsilon$ to 8,100, and the covariance between $\epsilon$ and $v$ to 379. We draw
[Figure 4 panels: A - Original curves; B - Rearranged curves. Horizontal axis: quantile index.]

Figure 4. Chernozhukov and Hansen's estimates of the structural quantile
functions of earnings for veterans (left panel), and their rearrangements (right
panel).



We use the estimators for the structural distribution and quantile functions described in the 
previous section. We monotonize the estimates using either rearrangement or isotonization. 
We use isotonization as a benchmark since it is the standard approach in mean regression prob- 
lems (Mammen, 1991); it amounts to projecting the estimated function on the set of monotone 
functions. 

Table 1 reports ratios of estimation errors of the rearranged and isotonized estimates to those
of the original estimates, recorded in percentage terms. The target functions are the structural
distribution and quantile functions. We measure estimation errors using the average $L^p$ norms
$\|\cdot\|_p$ with $p = 1, 2,$ and $\infty$, and we compute them as Monte Carlo averages of $\|f_0 - f\|_p$, where
$f_0$ is the target function, and $f$ is either the original or rearranged or isotonized estimate of
this function.

We find that the rearranged estimators noticeably outperform the original estimators, achiev- 
ing a reduction in estimation error up to 14%, depending on the target function and the norm. 
Moreover, in this case the better approximation of the rearranged estimates to the structural 



5,000 Monte Carlo samples of size $n = 11{,}627$. We generate the values of $Y$ and $X$ by drawing disturbances
$(\epsilon, v)$ from a bivariate normal distribution with zero mean and the estimated covariance matrix.

Figure 5. Simultaneous 90% confidence bands for structural quantile functions
of earnings and structural quantile effect of Vietnam veteran status on
earnings. The bands for the quantile functions are intersected with the class of
monotone functions.



functions also produces more accurate estimates of the distribution and quantile effects, achiev- 
ing a 3% to 9% reduction in estimation error for the distribution estimator and a 3% to 14% 
reduction in estimation error for the quantile estimator, depending on the norm. 

We also find that the rearranged estimators noticeably outperform the isotonized estimators, 
achieving up to a further 4% reduction in estimation error, depending on the target function 
and the norm. The reason is that isotonization projects the original fitted function on the 
set of monotone functions, finding the flattest fit in this set. In contrast, rearrangement sorts 
the original fitted function, finding the steepest fit that preserves measure. In the context 
of estimating quantile and distribution functions, the target functions tend to be non-flat, 
suggesting that rearrangement should be typically preferred over isotonization.

^To give some intuition about this point, it is instructive to consider a simple example with a two-point
domain $\{0, 1\}$. Suppose that the target function $f_0 : \{0, 1\} \to \mathbb{R}$ is increasing and steep, namely $f_0(0) < f_0(1)$,
and the fitted function $\hat{f} : \{0, 1\} \to \mathbb{R}$ is decreasing, with $\hat{f}(0) > \hat{f}(1)$. In this case, isotonization produces
a nondecreasing function $\bar{f} : \{0, 1\} \to \mathbb{R}$, which is flat, with $\bar{f}(0) = \bar{f}(1) = [\hat{f}(0) + \hat{f}(1)]/2$, and is somewhat
unsatisfactory. In such cases rearrangement can significantly outperform isotonization, since it produces the
steepest fit, namely it produces $\hat{f}^* : \{0, 1\} \to \mathbb{R}$ with $\hat{f}^*(0) = \hat{f}(1) < \hat{f}^*(1) = \hat{f}(0)$. This observation provides a
simple theoretical underpinning for the estimation results we see in Table 1.
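The two-point example can be reproduced with a short Python sketch. The isotonization below is a plain pool-adjacent-violators routine written for illustration (it is not taken from any particular library), and the numerical values of the target and the fit are hypothetical.

```python
import numpy as np

def isotonize(f):
    """Least-squares projection onto nondecreasing sequences (pool adjacent violators)."""
    blocks = []                                       # each block stores [sum, count]
    for value in np.asarray(f, dtype=float):
        blocks.append([value, 1])
        # Merge the last two blocks while their means violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]:
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    return np.concatenate([np.full(c, s / c) for s, c in blocks])

def rearrange(f):
    """Monotone rearrangement: sort the fitted values."""
    return np.sort(np.asarray(f, dtype=float))

f0 = np.array([0.0, 1.0])        # increasing, steep target (illustrative values)
f_hat = np.array([0.9, 0.1])     # decreasing fit
f_iso = isotonize(f_hat)         # flat fit: both values equal the average 0.5
f_star = rearrange(f_hat)        # steepest fit: values 0.1 and 0.9

print(np.abs(f_iso - f0).max(), np.abs(f_star - f0).max())
```

In this example the sup-norm error of the isotonized fit is 0.5, while that of the rearranged fit is 0.1, which illustrates why rearrangement can do better when the target is far from flat.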
Table 1. Ratios of estimation error of rearranged and isotonic estimators to
those of original estimators, in percentage terms.

4. Conclusion

This paper develops a monotonization procedure for estimation of conditional and structural 
quantile and distribution functions based on rearrangement-related operations. Starting from 
a possibly non-monotone empirical curve, the procedure produces a rearranged curve that 
not only satisfies the natural monotonicity requirement, but also has smaller estimation error 
than the original curve. We derive asymptotic distribution theory for the rearranged curves, 
and illustrate the usefulness of the approach with an empirical application and a simulation 
example. There are many more potential applications of the results of the paper to other 
econometric problems with shape restrictions (see e.g. Matzkin, 1994, and Chernozhukov et 
al., 2006). 

Appendix A. Proofs 

A.1. Proof of Proposition 1. First, note that the distribution of $Y_x$ has no atoms, i.e.,

$$\Pr[Y_x = y] = \Pr[Q(U|x) = y] = \Pr[U \in \{u \in \mathcal{U} : u \text{ is a root of } Q(u|x) = y\}] = 0,$$

since the number of roots of $Q(u|x) = y$ is finite under (a)-(b), and $U \sim \text{Uniform}(\mathcal{U})$. Next,
by assumptions (a)-(b) the number of critical values of $Q(u|x)$ is finite, hence claim (1) follows.
Next, for any regular $y$, we can write $F(y|x)$ as

$$\int_0^1 1\{Q(u|x) \le y\}\,du = \sum_{k=0}^{K(y|x)-1} \int_{u_k(y|x)}^{u_{k+1}(y|x)} 1\{Q(u|x) \le y\}\,du + \int_{u_{K(y|x)}(y|x)}^{1} 1\{Q(u|x) \le y\}\,du,$$

where $u_0(y|x) := 0$ and $\{u_k(y|x), \text{ for } k = 1, 2, \ldots, K(y|x) < \infty\}$ are the roots of $Q(u|x) = y$
in increasing order. Note that the sign of $\partial_u Q(u|x)$ alternates over consecutive $u_k(y|x)$,
determining whether $1\{Q(u|x) \le y\} = 1$ on the interval $[u_{k-1}(y|x), u_k(y|x)]$. Hence the first
term in the previous expression simplifies to
$\sum_{k=0}^{K(y|x)-1} 1\{\partial_u Q(u_{k+1}(y|x)|x) > 0\}(u_{k+1}(y|x) - u_k(y|x))$;
while the last term simplifies to $1\{\partial_u Q(u_{K(y|x)}(y|x)|x) < 0\}(1 - u_{K(y|x)}(y|x))$. An
additional simplification yields the expression given in claim (2) of the proposition.

The proof of claim (3) follows by taking the derivative of the expression in claim (2), noting that
at any regular value $y$ the number of solutions $K(y|x)$ and $\text{sign}(\partial_u Q(u_k(y|x)|x))$ are locally
constant; moreover,

$$\partial_y u_k(y|x) = \frac{\text{sign}(\partial_u Q(u_k(y|x)|x))}{|\partial_u Q(u_k(y|x)|x)|}.$$

Combining these facts we get the expression for the derivative given in claim (3).

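To show the absolute continuity of $F$ with $f$ being the Radon-Nikodym derivative, it suffices
to show that for each $y' \in \mathcal{Y}_x$, $\int_{-\infty}^{y'} f(y|x)\,dy = \int_{-\infty}^{y'} dF(y|x)$, cf. Theorem 31.8 in Billingsley
(1995). Let $V_t^x$ be the union of closed balls of radius $t$ centered on the critical points $\mathcal{Y}_x \setminus \mathcal{Y}_x^*$, and
define $\mathcal{Y}_x^t = \mathcal{Y}_x \setminus V_t^x$. Then, $\int_{-\infty}^{y'} 1\{y \in \mathcal{Y}_x^t\} f(y|x)\,dy = \int_{-\infty}^{y'} 1\{y \in \mathcal{Y}_x^t\}\,dF(y|x)$. Since the set of
critical points $\mathcal{Y}_x \setminus \mathcal{Y}_x^*$ is finite and has mass zero under $F$,
$\int_{-\infty}^{y'} 1\{y \in \mathcal{Y}_x^t\}\,dF(y|x) \uparrow \int_{-\infty}^{y'} dF(y|x)$
as $t \to 0$. Therefore, $\int_{-\infty}^{y'} 1\{y \in \mathcal{Y}_x^t\} f(y|x)\,dy \uparrow \int_{-\infty}^{y'} f(y|x)\,dy = \int_{-\infty}^{y'} dF(y|x)$.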

Claim (4) follows by noting that at the regions where $s \mapsto Q(s|x)$ is increasing and
one-to-one, we have that $F(y|x) = \int_{Q(s|x) \le y} ds = \int_{s \le Q^{-1}(y|x)} ds = Q^{-1}(y|x)$. Inverting the equation
$u = F(Q^*(u|x)|x) = Q^{-1}(Q^*(u|x)|x)$ yields $Q^*(u|x) = Q(u|x)$.

Claim (5). We have that $Y_x = Q(U|x)$ has quantile function $Q^*$. A quantile function is known
to be equivariant to monotone increasing transformations, including location-scale
transformations. Thus, this is true in particular for $Q^*$.
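Claims (4) and (5) are easy to check numerically. The sketch below builds the rearranged quantile function on a uniform grid directly from its definition, $F(y) = \int 1\{Q(u) \le y\}\,du$ followed by inversion, and verifies that an increasing curve is left unchanged and that the operator commutes with an increasing location-scale transformation. The grid, the test curves, and the name `rearranged_quantile` are illustrative assumptions.

```python
import numpy as np

def rearranged_quantile(q_values, u):
    """Q*(u) = inf{y : F(y) >= u}, where F(y) is the share of grid points with Q <= y.

    q_values: the curve Q sampled on a uniform grid of (0, 1).
    u:        quantile indices in (0, 1] at which Q* is evaluated.
    """
    ys = np.sort(np.asarray(q_values, dtype=float))
    n = ys.size
    # F(ys[k]) is at least (k + 1) / n, so pick the smallest k with (k + 1) / n >= u.
    k = np.ceil(np.asarray(u) * n).astype(int) - 1
    return ys[np.clip(k, 0, n - 1)]

grid = (np.arange(50) + 0.5) / 50          # midpoints of a uniform grid on (0, 1)

# Claim (4): an increasing curve coincides with its rearrangement.
q_inc = grid ** 3
same = np.allclose(rearranged_quantile(q_inc, grid), q_inc)

# Claim (5): equivariance to increasing location-scale transformations.
q_non = np.sin(3 * grid)                   # non-monotone curve (illustrative)
a, b = 2.0, 5.0
equivariant = np.allclose(
    rearranged_quantile(a + b * q_non, grid),
    a + b * rearranged_quantile(q_non, grid),
)
print(same, equivariant)
```

On a uniform grid evaluated at its own midpoints, this construction reduces to sorting the sampled values, which is how the rearrangement is typically computed in practice.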

Claim (6) is immediate from claim (3). 

Claim (7). The proof of continuity of $F$ is subsumed in Step 1 of the proof of Proposition
2 (see below). Therefore, for any sequence $x_t \to x$ we have that $F(y|x_t) \to F(y|x)$ uniformly
in $y$, and $F$ is continuous. Let $u_t \to u$ and $x_t \to x$. Since $F(y|x) = u$ has a unique root
$y = Q^*(u|x)$, the root of $F(y|x_t) = u_t$, i.e., $y_t = Q^*(u_t|x_t)$, converges to $y$ by a standard
argument, see, e.g., van der Vaart and Wellner (1997). □



A.2. Proof of Propositions 2-6. In the proofs that follow we will repeatedly use Lemma 1,
which establishes the equivalence of continuous convergence and uniform convergence:

Lemma 1. Let $\mathbb{D}$ and $\mathbb{D}'$ be complete separable metric spaces, with $\mathbb{D}$ compact. Suppose
$f : \mathbb{D} \to \mathbb{D}'$ is continuous. Then a sequence of functions $f_n : \mathbb{D} \to \mathbb{D}'$ converges to $f$ uniformly
on $\mathbb{D}$ if and only if for any convergent sequence $x_n \to x$ in $\mathbb{D}$ we have that $f_n(x_n) \to f(x)$.

Proof of Lemma 1: See, for example, Resnick (1987), page 2. □
Proof of Proposition 2.






Part 1. We have that for any $\delta > 0$, there exists $\epsilon > 0$ such that for $u \in B_\epsilon(u_k(y|x))$ and
for small enough $t > 0$

$$1\{Q(u|x) + t h_t(u|x) \le y\} \le 1\{Q(u|x) + t(h(u_k(y|x)|x) - \delta) \le y\},$$

for all $k \in \{1, 2, \ldots, K(y|x)\}$, whereas for all $u \notin \cup_k B_\epsilon(u_k(y|x))$, as $t \to 0$,

$$1\{Q(u|x) + t h_t(u|x) \le y\} = 1\{Q(u|x) \le y\}.$$

Therefore,

$$\frac{\int_0^1 1\{Q(u|x) + t h_t(u|x) \le y\}\,du - \int_0^1 1\{Q(u|x) \le y\}\,du}{t} \le \sum_{k=1}^{K(y|x)} \int_{B_\epsilon(u_k(y|x))} \frac{1\{Q(u|x) + t(h(u_k(y|x)|x) - \delta) \le y\} - 1\{Q(u|x) \le y\}}{t}\,du, \tag{A.1}$$

which by the change of variable $y' = Q(u|x)$ is equal to

$$\frac{1}{t} \sum_{k=1}^{K(y|x)} \int_{J_k \cap [y,\, y - t(h(u_k(y|x)|x) - \delta)]} \frac{dy'}{|\partial_u Q(Q^{-1}(y'|x)|x)|},$$

where $J_k$ is the image of $B_\epsilon(u_k(y|x))$ under $u \mapsto Q(\cdot|x)$. The change of variable is possible
because for $\epsilon$ small enough, $Q(\cdot|x)$ is one-to-one between $B_\epsilon(u_k(y|x))$ and $J_k$.

Fixing $\epsilon > 0$, for $t \to 0$, we have that
$J_k \cap [y, y - t(h(u_k(y|x)|x) - \delta)] = [y, y - t(h(u_k(y|x)|x) - \delta)]$,
and $|\partial_u Q(Q^{-1}(y'|x)|x)| \to |\partial_u Q(u_k(y|x)|x)|$ as $Q^{-1}(y'|x) \to u_k(y|x)$. Therefore, the right
hand term in (A.1) is no greater than

$$\sum_{k=1}^{K(y|x)} \frac{-h(u_k(y|x)|x) + \delta}{|\partial_u Q(u_k(y|x)|x)|} + o(1).$$

Similarly $\sum_{k=1}^{K(y|x)} \frac{-h(u_k(y|x)|x) - \delta}{|\partial_u Q(u_k(y|x)|x)|} + o(1)$ bounds (A.1) from below. Since $\delta > 0$ can be made
arbitrarily small, the result follows.

To show that the result holds uniformly in $(y,x) \in K$, a compact subset of $\mathcal{YX}^*$, we use
Lemma 1. Take a sequence of $(y_t, x_t)$ in $K$ that converges to $(y, x) \in K$, then the preceding
argument applies to this sequence, since (1) the function $(y,x) \mapsto -h(u_k(y|x)|x)/|\partial_u Q(u_k(y|x)|x)|$
is uniformly continuous on $K$, and (2) the function $(y,x) \mapsto K(y|x)$ is uniformly continuous
on $K$. To see (2), note that $K$ excludes a neighborhood of critical points $(\mathcal{Y} \setminus \mathcal{Y}_x^*, x \in \mathcal{X})$,
and therefore can be expressed as the union of a finite number of compact sets $(K_1, \ldots, K_M)$
such that the function $K(y|x)$ is constant over each of these sets, i.e., $K(y|x) = k_j$ for some
integer $k_j > 0$, for all $(y,x) \in K_j$ and $j \in \{1, \ldots, M\}$. Likewise, (1) follows by noting that
the limit expression for the derivative is continuous on each of the sets $(K_1, \ldots, K_M)$ by the
assumed continuity of $h(u|x)$ in both arguments, continuity of $u_k(y|x)$ (implied by the Implicit
Function Theorem), and the assumed continuity of $\partial_u Q(u|x)$ in both arguments. □
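The limit obtained in Part 1 can be checked numerically on a simple example. The Python sketch below compares the finite-difference quotient $[F(y|h,t) - F(y)]/t$ with the expression $-\sum_k h(u_k)/|\partial_u Q(u_k)|$ for a non-monotone curve with two roots. The curve $Q(u) = (u - 0.5)^2$, the direction $h(u) = u$, the evaluation point, and the grid size are all illustrative assumptions chosen so that the roots are available in closed form.

```python
import numpy as np

N = 1_000_000
u = (np.arange(N) + 0.5) / N          # fine uniform grid on (0, 1)

def Q(u):
    return (u - 0.5) ** 2             # non-monotone curve (illustrative)

def h(u):
    return u                          # direction of perturbation (illustrative)

def F(y, t=0.0):
    """Grid approximation of the integral of 1{Q(u) + t h(u) <= y} over (0, 1)."""
    return np.mean(Q(u) + t * h(u) <= y)

y, t = 0.09, 1e-3                     # y = 0.09 is a regular value of Q
finite_diff = (F(y, t) - F(y)) / t

roots = np.array([0.5 - np.sqrt(y), 0.5 + np.sqrt(y)])   # solutions of Q(u) = y
dQ = 2.0 * (roots - 0.5)                                   # derivative of Q at the roots
formula = -np.sum(h(roots) / np.abs(dQ))

print(finite_diff, formula)
```

The two numbers agree up to the discretization error of the grid and the finite step $t$. The grid has to be much finer than $t$, since the perturbation moves the level set by a distance of order $t$.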
Part 2. For a fixed $x$ the result follows by Part 1 of Proposition 2, by Step 1 of the proof
below, and by an application of the Hadamard differentiability of the quantile operator shown
by Doss and Gill (1992). Step 2 establishes uniformity over $x \in \mathcal{X}$.

Step 1. Let $K$ be a compact subset of $\mathcal{YX}^*$. Let $(y_t, x_t)$ be a sequence in $K$, convergent to a
point, say $(y,x)$. Then, for every such sequence,
$\epsilon_t := t\|h_t\|_\infty + \|Q(\cdot|x_t) - Q(\cdot|x)\|_\infty + |y_t - y| \to 0$, and

$$|F(y_t|x_t, h_t) - F(y|x)| \le \left|\int_0^1 [1\{Q(u|x_t) + t h_t(u|x_t) \le y_t\} - 1\{Q(u|x) \le y\}]\,du\right| \le \left|\int_0^1 1\{|Q(u|x) - y| \le \epsilon_t\}\,du\right| \to 0, \tag{A.2}$$

where the last step follows from the absolute continuity of $y \mapsto F(y|x)$, the distribution function
of $Q(U|x)$. By setting $h_t = 0$ the above argument also verifies that $F(y|x)$ is continuous in
$(y, x)$. Lemma 1 implies uniform convergence of $F(y|x, h_t)$ to $F(y|x)$, which in turn implies by
a standard argument the uniform convergence of quantiles $Q^*(u|x, h_t) \to Q^*(u|x)$, uniformly
over $K^*$, where $K^*$ is any compact subset of $\mathcal{UX}^*$.
Step 2. We have that uniformly over $K^*$,

$$\frac{F(Q^*(u|x,h_t)|x,h_t) - F(Q^*(u|x,h_t)|x)}{t} = D_h(Q^*(u|x,h_t)|x) + o(1) = D_h(Q^*(u|x)|x) + o(1), \tag{A.3}$$

using Step 1, Proposition 2, and the continuity properties of $D_h(y|x)$. Further, uniformly over
$K^*$, by Taylor expansion and Proposition 1, as $t \to 0$,

$$\frac{F(Q^*(u|x,h_t)|x) - F(Q^*(u|x)|x)}{t} = f(Q^*(u|x)|x)\,\frac{Q^*(u|x,h_t) - Q^*(u|x)}{t} + o(1), \tag{A.4}$$

and (as will be shown below)

$$\frac{F(Q^*(u|x,h_t)|x,h_t) - F(Q^*(u|x)|x)}{t} = o(1), \tag{A.5}$$

as $t \to 0$. Observe that the left hand side of (A.5) equals that of (A.4) plus that of (A.3). The
result then follows.

It only remains to show that equation (A.5) holds uniformly in $K^*$. Note that for any
right-continuous cdf $F$, we have that $u \le F(Q^*(u)) \le u + F(Q^*(u)) - F(Q^*(u)-)$, where $F(\cdot-)$
denotes the left limit of $F$, i.e., $F(x_0-) = \lim_{x \uparrow x_0} F(x)$. For any continuous, strictly increasing
cdf $F$, we have that $F(Q^*(u)) = u$. Therefore, write

$$
\begin{aligned}
0 &\le \frac{F(Q^*(u|x,h_t)|x,h_t) - F(Q^*(u|x)|x)}{t} \\
&\le \frac{u + F(Q^*(u|x,h_t)|x,h_t) - F(Q^*(u|x,h_t)-|x,h_t) - u}{t} \\
&\le \frac{F(Q^*(u|x,h_t)|x,h_t) - F(Q^*(u|x,h_t)-|x,h_t)}{t} \\
&\overset{(1)}{=} \frac{F(Q^*(u|x,h_t)|x,h_t) - F(Q^*(u|x,h_t)|x)}{t} - \frac{F(Q^*(u|x,h_t)-|x,h_t) - F(Q^*(u|x,h_t)-|x)}{t} \\
&\overset{(2)}{=} D_h(Q^*(u|x,h_t)|x) - D_h(Q^*(u|x,h_t)-|x) + o(1) = o(1),
\end{aligned}
$$

as $t \to 0$, where in (1) we use that $F(Q^*(u|x,h_t)|x) = F(Q^*(u|x,h_t)-|x)$ since $F(y|x)$ is
continuous and strictly increasing in $y$, and in (2) we use Proposition 2. □

^See, e.g., Lemma 1 in Chernozhukov and Fernandez-Val (2005).

The following lemma, due to Pratt (1960), will be very useful to prove Proposition 4. 

Lemma 2. Let $|f_n| \le G_n$ and suppose that $f_n \to f$ and $G_n \to G$ almost everywhere, then if
$\int G_n \to \int G$ finite, then $\int f_n \to \int f$.

Proof of Lemma 2. See Pratt (1960). □ 

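Lemma 3 (Boundedness and Integrability Properties). Under the hypotheses of Proposition
2, we have that, for all $(u, x) \in \mathcal{UX}$,

$$|\tilde{D}_{h_t}(u|x,t)| \le \|h_t\|_\infty, \tag{A.6}$$

and, for all $(y, x) \in \mathcal{YX}$,

$$|D_{h_t}(y|x,t)| \le \Delta(y|x,t) := \int_0^1 \frac{1\{Q(u|x) - t\|h_t\|_\infty \le y\} - 1\{Q(u|x) + t\|h_t\|_\infty \le y\}}{t}\,du, \tag{A.7}$$

where for any $x_t \to x \in \mathcal{X}$, as $t \to 0$,

$$\Delta(y|x_t,t) \to 2\|h\|_\infty f(y|x) \text{ for a.e. } y \in \mathcal{Y} \quad \text{and} \quad \int_{\mathcal{Y}} \Delta(y|x_t,t)\,dy \to \int_{\mathcal{Y}} 2\|h\|_\infty f(y|x)\,dy.$$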

Proof of Lemma 3. To show (A.6) note that

$$\sup_{(u,x) \in \mathcal{UX}} |\tilde{D}_{h_t}(u|x,t)| \le \|h_t\|_\infty \tag{A.8}$$

immediately follows from the equivariance property noted in Claim (5) of Proposition 1.

The inequality (A.7) is trivial. That for any $x_t \to x \in \mathcal{X}$, $\Delta(y|x_t,t) \to 2\|h\|_\infty f(y|x)$ for
a.e. $y \in \mathcal{Y}$ follows by applying Proposition 2 respectively with functions $h'_t(u|x) = \|h_t\|_\infty$ and
$h'_t(u|x) = -\|h_t\|_\infty$ (for the case when $f(y|x) > 0$; and trivially otherwise). Similarly, that for
any $y_t \to y \in \mathcal{Y}$, $\Delta(y_t|x,t) \to 2\|h\|_\infty f(y|x)$ for a.e. $x \in \mathcal{X}$ follows by Proposition 2 (for the
case when $f(y|x) > 0$; and trivially otherwise).
Further, by Fubini's Theorem,

$$\int_{\mathcal{Y}} \Delta(y|x_t,t)\,dy = \int_0^1 \underbrace{\left(\int_{\mathcal{Y}} \frac{1\{Q(u|x_t) - t\|h_t\|_\infty \le y\} - 1\{Q(u|x_t) + t\|h_t\|_\infty \le y\}}{t}\,dy\right)}_{=: f_t(u)}\,du. \tag{A.9}$$

Note that $f_t(u) \le 2\|h_t\|_\infty$. Moreover, for almost every $u$, $f_t(u) = 2\|h_t\|_\infty$ for small enough $t$,
and $2\|h_t\|_\infty$ converges to $2\|h\|_\infty$ as $t \to 0$. Then, trivially, $\int_0^1 2\|h_t\|_\infty\,du \to 2\|h\|_\infty$. By Lemma
2 the right hand side of (A.9) converges to $2\|h\|_\infty$. □

A.3. Proof of Proposition 3. Define $m_t(y|x,y') := g(y|x,y')D_{h_t}(y|x,t)$ and
$m(y|x,y') := g(y|x,y')D_h(y|x)$. To show claim (1), we need to demonstrate that for any $y'_t \to y'$ and $x_t \to x$

$$\int_{\mathcal{Y}} m_t(y|x_t,y'_t)\,dy \to \int_{\mathcal{Y}} m(y|x,y')\,dy, \tag{A.10}$$

and that the limit is continuous in $(x,y')$. We have that $|m_t(y|x_t,y'_t)|$ is bounded, for some
constant $C$, by $C\Delta(y|x_t,t)$ which converges a.e. and the integral of which converges to a finite
number by Lemma 3. Moreover, by Proposition 2, for almost every $y$ we have
$m_t(y|x_t,y'_t) \to m(y|x,y')$. We conclude that (A.10) holds by Lemma 2.

In order to check continuity, we need to show that for any $y'_t \to y'$ and $x_t \to x$

$$\int_{\mathcal{Y}} m(y|x_t,y'_t)\,dy \to \int_{\mathcal{Y}} m(y|x,y')\,dy. \tag{A.11}$$

We have that $m(y|x_t,y'_t) \to m(y|x,y')$ for almost every $y$. Moreover, $m(y|x_t,y'_t)$ is dominated
by $2\|g\|_\infty\|h\|_\infty f(y|x_t)$, which converges to $2\|g\|_\infty\|h\|_\infty f(y|x)$ for almost every $y$, and,
moreover, $\int_{\mathcal{Y}} \|g\|_\infty\|h\|_\infty f(y|x_t)\,dy$ converges to $\|g\|_\infty\|h\|_\infty$. Conclude that (A.11) holds by Lemma
2.

To show claim (2), define $m_t(u|x,u') = g(u|x,u')D_{h_t}(u|x)$ and $m(u|x,u') = g(u|x,u')D_h(u|x)$.
Here we need to show that for any $u'_t \to u'$ and $x_t \to x$,

$$\int_{\mathcal{U}} m_t(u|x_t,u'_t)\,du \to \int_{\mathcal{U}} m(u|x,u')\,du, \tag{A.12}$$

and that the limit is continuous in $(u',x)$. We have that $m_t(u|x_t,u'_t)$ is bounded by $g(u|x_t)\|h_t\|_\infty$,
which converges to $g(u|x)\|h\|_\infty$ for a.e. $u$. Furthermore, the integral of $g(u|x_t)\|h_t\|_\infty$ converges
to the integral of $g(u|x)\|h\|_\infty$ by the dominated convergence theorem. Moreover, by Proposition
2, we have that $m_t(u|x_t,u'_t) \to m(u|x,u')$ for almost every $u$. We conclude that (A.12)
holds by Lemma 2.

In order to check the continuity of the limit, we need to show that for any $u'_t \to u'$ and
$x_t \to x$,

$$\int_{\mathcal{U}} m(u|x_t,u'_t)\,du \to \int_{\mathcal{U}} m(u|x,u')\,du. \tag{A.13}$$

We have that $m(u|x_t,u'_t) \to m(u|x,u')$ for almost every $u$. Moreover, for small enough $t$,
$m(u|x_t,u'_t)$ is dominated by $|g(u|x_t,u'_t)|\,\|h\|_\infty$, which converges for almost every value of $u$ to
$|g(u|x,u')|\,\|h\|_\infty$ as $t \to 0$. Furthermore, the integral of $|g(u|x_t,u'_t)|\,\|h\|_\infty$ converges to the
integral of $|g(u|x,u')|\,\|h\|_\infty$ by the dominated convergence theorem. We conclude that (A.13)
holds by Lemma 2. □

A.4. Proof of Proposition 5. This proposition follows directly from the functional delta method
(e.g., van der Vaart, 1998). Rather than restating the method, it takes less space to
recall the proof in the current context.

To show the first part, consider the map $g_n(y,x|h) = a_n(F(y|x,h/a_n) - F(y|x))$. The
sequence of maps satisfies $g_{n'}(y,x|h_{n'}) \to D_h(y|x)$ in $\ell^\infty(K)$ for every subsequence $h_{n'} \to h$ in
$\ell^\infty(\mathcal{UX})$, where $h$ is continuous. It follows by the extended continuous mapping theorem that,
in $\ell^\infty(K)$, $g_n(y,x|a_n(\hat{Q}(u|x) - Q(u|x))) \Rightarrow D_G(y|x)$ as a stochastic process indexed by $(y,x)$,
since $a_n(\hat{Q}(u|x) - Q(u|x)) \Rightarrow G(u|x)$ in $\ell^\infty(\mathcal{UX})$.

Conclude similarly for the second part. □ 
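
The convergence of the maps $g_n$ to the derivative can be illustrated numerically. The sketch below is a hedged illustration under assumptions not stated in this proof: there is no covariate, $F(y|h)$ denotes $\int_0^1 1\{Q(u) + h(u) \le y\}\,du$, $Q$ is the strictly increasing standard logistic quantile function, the direction is $h(u) = u$, and in this monotone case the limit is taken to be $D_h(y) = -f(y)\,h(F(y))$. All function names are hypothetical.

```python
import math

def Q(u):
    # Standard logistic quantile function (illustrative assumption).
    return math.log(u / (1.0 - u))

def F(y):
    # Logistic distribution function, the inverse of Q.
    return 1.0 / (1.0 + math.exp(-y))

def f(y):
    # Logistic density, the derivative of F.
    return F(y) * (1.0 - F(y))

def h(u):
    # Hypothetical continuous direction of perturbation.
    return u

def F_perturbed(y, scale, n_grid=200_000):
    # Midpoint-rule approximation of int_0^1 1{Q(u) + h(u)/scale <= y} du.
    hits = 0
    for i in range(n_grid):
        u = (i + 0.5) / n_grid
        hits += int(Q(u) + h(u) / scale <= y)
    return hits / n_grid

def g_n(y, a_n):
    # The map a_n * (F(y | h / a_n) - F(y)), written without covariates.
    return a_n * (F_perturbed(y, a_n) - F(y))

def hadamard_limit(y):
    # Assumed limit for strictly increasing Q: D_h(y) = -f(y) * h(F(y)).
    return -f(y) * h(F(y))

if __name__ == "__main__":
    y = 0.3
    for a_n in (10.0, 30.0, 100.0):
        # The second column should approach the third as a_n grows.
        print(a_n, g_n(y, a_n), hadamard_limit(y))
```

The grid error in `g_n` is of order `a_n / n_grid`, so the grid must be refined as `a_n` grows; the sketch fixes the direction $h$, whereas the proof requires convergence along every sequence $h_{n'} \to h$.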

A.5. Proof of Proposition 6. This follows by the functional delta method, similarly to the
proof of Proposition 5. □